<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<html>
<head>
	<meta http-equiv="content-type" content="text/html; charset=utf-8">
	<meta http-equiv="content-style-type" content="text/css">
	<link href="pcpdoc.css" rel="stylesheet" type="text/css">
	<link href="images/pcp.ico" rel="icon" type="image/ico">
	<title>PCP Portability and QA</title>
</head>

<style>

table, th, td {
    border: 1px solid black;
    border-collapse: collapse;
    border-spacing: 2px;
    padding: 4px;
}

td {
    text-align: right;
}

code {
    color: #000060;
    font-size: 10pt;
    margin-left: 20px;
}

</style>

<body lang="en-AU" text="#000060" dir="LTR">
<table width="100%" border="0" cellpadding="0" cellspacing="0" style="page-break-before: always">
    <tr><td width="64" height="64"><a href="https://pcp.io/"><img src="images/pcpicon.png" alt="pcpicon" align="TOP" width="64" height="64" border="0"></a></td>
	<td width="1"><p>&nbsp;&nbsp;&nbsp;&nbsp;</p></td>
	<td width="500"><p align="LEFT"><a href="index.html"><font color="#cc0000">Home</font></a></p></td>
    </tr>
</table>

<h1 align="CENTER" style="margin-top: 0.48cm; margin-bottom: 0.32cm"><font size="6">
Portability Guidelines for PCP QA
</font></h1>
<p align="center"><font size="4">Ken McDonell &ltkenj@kenj.id.au&gt</font></p>
<p align="center"><font size="3"><i>September 2018</i></font></p>

<h2>Background</h2>

<p>In this document the generic term &quot;component&quot; will be used to describe a piece of
functionality in Performance Co-Pilot (PCP) that could be any of the following:</p>
<ul>
<li>a library</li>
<li>a method or module or class in a library</li>
<li>an application</li>
<li>a PMDA</li>
<li>a service or API</li>
<li>an object in the file system</li>
</ul>

<p>Obviously the functional goals and feature focus of PCP has been
to build tools that help understand hard performance problems that
are seen in some the the world's most complex systems.  Over more
than two decades, the delivery of components to address the feature
requirements has been build on engineering processes and approaches
with a technical focus on:</p>

<ul>
<li>
<b>Portability</b>

<p>From the outset we were concerned with making PCP work successfully
across a wide variety of environments.</p>

<p>The earliest PCP architectural deliberations began at Pyramid
Technology in 1993, but by this time I had already been working on
portable software for more than 15 years.  Then the PCP incubator moved
to SGI, and in the very early days of Linux, Mark Goodwin undertook
an &quot;IRIX to Linux&quot; port as a skunk works project within SGI
Engineering.  A little later, SGI's Clustered File System (CXFS) had
PCP components for IRIX, Windows, HPUX, AIX, MacOS and Linux, across
a variety of CPU architectures, C compilers and operating systems.</p>

<p>Although many of these have passed away, PCP survives and is
now actively supported on all Linux variants (most importantly the
Fedora/RPM, OpenSuSE/RPM and Debian/dpkg platforms and their respective
derivatives), the BSD family, OpenIndiana (<i>nee</i> Sun Solaris),
Mac OS X and Windows.</p>

<p>So there is a very rich history of engineering for portability
in the PCP DNA, and an expectation that new PCP components will be
&quot;robust&quot; (see below for a definition of &quot;robust&quot;)
across all of the supported platforms and environments.</p>

</li>
<li><b>Testability</b>

<p>The PCP project has never had a dedicated QA team, so by
necessity the QA model that was adopted (and at times enforced)
was one where if an engineer added a feature or fixed a bug, there
was an expectation that there would be additional QA coverage for the
associated changes.  Although this approach suffers from tunnel-vision
and common assumptions between development and testing, it does have
the advantage that testing and QA is a communal responsibility.</p>

<p>This effort grew into the very large QA infrastructure that lives
below the <i>qa</i> directory in the source tree and is shipped in
the <b>pcp-testsuite</b> package.  This represents a significant
engineering effort in its own right as the table below shows:</p>

<table>
    <tr><th>&nbsp;</th><th>below the src dir</th><th>below the qa dir</th></tr>
    <tr><td>C or C++</td><td>320,000</td><td>37,000</td></tr>
    <tr><td>shell</td><td>23,000</td><td>85,000</td></tr>
    <tr><td>perl</td><td>18,000</td><td>700</td></tr>
    <tr><td>python</td><td>17,000</td><td>200</td></tr>
<caption>Lines of Code</caption>
</table>

<p>The QA test suite comprises close to 1,100 test scripts.  The scripts
are used by individual developers to test code changes and check
for regressions.  In addition the entire suite of scripts is run
regularly across the 30+ machines in the Melbourne QA Farm.</p>

</li>
</ul>

<h2>Robustness</h2>

<p>Robustness simply means that every PCP application and service
either works correctly or detects that the environment will not support
correct execution and is either omitted from the build or omitted from
the packaging or reports a warning and exits in an orderly fashion.</p>

<p>Some examples may help illustrate this philosophy.</p>
<ul>

<li>The platform PMDAs (linux, windows, freebsd, openbsd, netbsd,
opensolaris, aix, ...) are only compiled and packaged if the build
is being run for the corresponding target platform.</li>

<li>Optional headers are checked for
using the <i>AC_CHECK_HEADERS</i>() macro in <a
href="https://github.com/performancecopilot/pcp/blob/master/configure.ac">configure.ac</a>
and <b>cpp</b> conditionals and macros are used to guard
the inclusion of these headers, e.g. &lt;strings.h&gt; in <a
href="https://github.com/performancecopilot/pcp/blob/master/src/dbpmda/src/util.c">src/dbpmda/src/util.c</a>.</li>

<li>In the Apple operating systems there used to be a
rpc.server.nqnfs.leases metric but this has been deprecated and is no
longer available.  The corresponding darwin PMDA uses PMAPI protocols
to return the type of this metric as <b>PM_TYPE_NOSUPPORT</b>
which all PCP clients are able to handle appropriately (either
ignore the metric or report an explanatory warning).  See <a
href="https://github.com/performancecopilot/pcp/blob/master/src/pmdas/darwin/pmda.c">src/pmdas/darwin/pmda.c</a></li>

<li>Every PCP QA test script completes with a status of pass, fail
or notrun.  The latter is used to indicate that the test either cannot
be run because some required component is missing or the test is not
appropriate for the current platform.  See the tests for the Python
<b>OrderedDict</b> module and the <b>pmrep</b> application in <a
href="https://github.com/performancecopilot/pcp/blob/master/qa/035">qa/035</a>.</li>

</ul>

<h2>Source Code and Compile Time Portability</h2>

<p>Mostly what's been done here is common and good engineering
practice.  For example using configure, conditionals in the
GNUmakefiles and assorted <b>sed</b>/<b>awk</b> rewriting scripts
to ensure the code will compile cleanly on all platforms.  Compiler
warnings are enabled and builds are expected to complete without
warnings.  And in the source code we demand thorough error checking
and error handling on all system and library calls.</p>

<p>We've extended the normal concept of macros to
include a set of globally defined names that are are used
for path and directory names, search paths, application
names and application options and arguments.  These are defined in <a
href="https://github.com/performancecopilot/pcp/blob/master/src/include/pcp.conf.in">src/include/pcp.conf.in</a>,
bound to values at build time by configure and installed
in /etc/pcp.conf.  These can then be used in
shell scripts and applications in C, C++, Perl, Python
have run-time access to these via <b>pmGetConfig</b>()
or <b>pmGetOptionalConfig</b>(), see for example <a
href="https://github.com/performancecopilot/pcp/blob/master/src/pmie/src/pmie.c">src/pmie/src/pmie.c</a>.</p>

<p>Even file pathname separators (/ for the sane world, \ elsewhere)
have been abstracted out and <b>pmPathSeparator</b>() is used to
construct pathnames from directory components.</p>

<p>At a higher level we don't even try to build code if it is not
intended for the target platform.</p>

<h2>Packaging Portability</h2>

<p>At packaging time we use conditionals to include only those
components that can be built and are expected to work for the target
platform.</p>

<p>This extends to wrapping some of the prerequisites in conditionals
if the prerequisite piece may not be present or may have a different
name.</p>

<p>For Debian packaging this means debian/control is build from <a
href="https://github.com/performancecopilot/pcp/blob/master/debian/control.master">debian/control.master</a>
and the ?{foo} elements of the <b>Build-Depends</b> and
<b>Depends</b> clauses are conditionally expanded by the <a
href="https://github.com/performancecopilot/pcp/blob/master/debian/fixcontrol.master">debian/fixcontrol.master</a>
script during the build.</p>

<p>For RPM packaging this means using configure to
perform macro substitution to create build/rpm/pcp.spec from <a
href="https://github.com/performancecopilot/pcp/blob/master/build/rpm/pcp.spec.in">build/rpm/pcp.spec.in</a>
and using <b>%if</b> within the spec file to selectively include packages
and adjust the <b>BuildRequires</b> and <b>Requires</b> clauses.</p>

<h2>QA Portability</h2>

<p>There are purpose-designed QA applications in the
<a href="https://github.com/performancecopilot/pcp/blob/master/qa/src">qa/src</a>
directory
and the source code for these applications should follow all of the
same guidelines for portability and build resilience as those that
apply for the applications and libraries that ship with the main part
of PCP.
The only possible exception is that error handling can be a
little more relaxed for the QA applications as they are used in a more
controlled manner.</p>

<h3>Use /etc/pcp.conf</h3>
<p>All QA test scripts will have the variables set in /etc/pcp.conf
placed in the environment using $PCP_DIR/etc/pcp.env which is
called from
<a href="https://github.com/performancecopilot/pcp/blob/master/qa/common.rc">common.rc</a>
(with $PCP_DIR typically unset).
And all QA tests source common.rc, although the sequence is a little
convoluted (and irrelevant for this discussion).</p>
<p>The important thing is that the following environment variables
become available to improve the portability of QA test scripts
and the <a href="#filtering">output filtering</a> (see below)
they must perform.</p>
<ul>
<li>A QA script should never use a literal pathname to a PCP
installed file or directory because there will be a $PCP_<something>
variable that can be used.  For example:<br>
<code>$ cd $PCP_PMDAS_DIR/mypmda</code><br>
is always correct, whereas<br>
<code>$ cd /var/lib/pcp/pmdas/mypmda</code><br>
may not work on some platforms.</li>
<li>The current platform is available via $PCP_PLATFORM.
The common values are <b>linux</b> (of any flavour), <b>freebsd</b>,
<b>netbsd</b>, <b>openbsd</b>, <b>darwin</b> (Mac OS X), <b>solaris</b>
(really OpenIndiana) and <b>mingw</b> (Windows).
This can be used to make high-level decisions about the behavior
of a QA test, see the <a href="#notrun">Not run</a> and
<a href="#alternate.out">Alternate .out files</a> sections below.
</li>
<li>If a QA tests depends on a particular version or versions of
PCP, then $PCP_VERSION can be tested.</li>
<li>Because there may be more than one version of a (non-PCP) command
installed or because we need special command line arguments to make
a (non-PCP) command behave in a standard manner, the following are
available:
<ul>
<li>$PCP_AWK_PROG Posix-compliant <b>awk</b></li>
<li>$PCP_SORT_PROG Unix-like <b>sort</b></li>
<li>$PCP_MAKE_PROG the preferred <b>gmake</b></li>
<li>$PCP_CC_PROG the preferred <b>gcc</b></li>
<li>$PCP_ECHO_PROG the standard echo and importantly the
prefix ($PCP_ECHO_N) and the suffix ($PCP_ECHO_C) to allow output
without a trailing newline.  These are important because<br>
<code>$ echo -n "Prompt? "</code><br>
is not portable, nor is<br>
<code>$ echo "Prompt? \c"</code><br>
but this portable and correct, albeit ugly!<br>
<code>$ $PCP_ECHO_PROG $PCP_ECHO_N "Prompt? $PCP_ECHO_C"</code><br>
</li>
<li>$PCP_PS_PROG the preferred <b>ps</b> and more importantly when
$PCP_PS_PROG is run with $PCP_PS_ALL_FLAGS as the argument we
get the fields and field-order that a QA test can rely upon.</li>
<li>$PCP_PYTHON_PROG the version of <b>python</b> that the installed PCP
components are expecting (critical for version 2 vs version 3
differences).</li>
</ul>
</ul>

<h3>Use the predefined variables</h3>
<p>PCP QA scripts follow a standard template when created with the
<a href="https://github.com/performancecopilot/pcp/blob/master/qa/new">qa/new</a>
script (which is the recommended way to create new QA test scripts)
and then the following local shell variables are available:
<ul>
<li>$<b>seq</b>: the current test sequence number</li>
<li>$<b>tmp</b>: a private and unique path that maybe used to create
temporary files or a temporary directory all of which will be
removed when the script ends.  For example:<br>
<code>$ ... >$tmp.out 2>$tmp.err</code><br>
<code>$ mkdir $tmp</code><br>
</li>
<li>$<b>sudo</b>: run the <b>sudo</b> command with arguments that make it work
the same everywhere</li>
<li>$<b>DSO_SUFFIX</b>: if you need to refer to a shared
library by name, avoid the literal
"<b>.so</b>" suffix; instead use <b>.</b>$DSO_SUFFIX that will become
<b>.so</b> or <b>.dll</b> or <b>.dylib</b> or ... and will work across all
platforms.</li>
</ul>

<h3>check-vm is your friend, not the enemy</h3>
<p>There is a a very large set of applications and packages
outside of PCP
that are required to build PCP from source and/or run PCP QA.
The script
<a href="https://github.com/performancecopilot/pcp/blob/master/qa/admin/check-vm">qa/admin/check-vm</a>
tries to capture all of these requirements.
<b>check-vm</b> should be used as follows:
<ul>
<li>Before starting a new QA cycle, running this script with
a <b>-p</b> argument will identify any packages that you should consider
installing.  Without the <b>-p</b> you'll see more details of what's
needed and why.</li>
<li>If you are a developer and add a new component, please make
sure that any additional dependencies for you new component have been
captured in qa/admin/check-vm.</li>
</ul></p>

<h3>Avoid exotic commands and non-standard command arguments</h3>

<p>Our QA tests scripts are run with <b>sh</b> not <b>bash</b> and on
some platforms these are really different!</p>

<p>Things like an <b>==</b> operand
used with
<b>test</b> (aka <b>[</b>) as in<br>
<code>$ if [ "$foo" == "bar" ] ...</code><br>
will not work with the Bourne shell.
Instead, use the <b>=</b> operator, e.g.<br>
<code>$ sh</code><br>
<code>$ x=1</code><br>
<code>$ [ "$x" = 1 ] && echo yippee</code><br>
<code>yippee</code><br>
<code>$ [ "$x" == 1 ] && echo boo</code><br>
<code>sh: 3: [: 1: unexpected operator</code></p>

<p>Also and any use of the <b>bash</b> <b>[[</b> operator<br>
<code>$ ... [[ ... ]]</code><br>
is going to blow up when presented to a real Bourne shell.
In most cases <b>test</b> (aka <b>[</b>) can be used for a
straight forward rewrite.</p>

<p>Even less portable is any use of the <b>bash</b> <b>$((...))</b>
construct for in-built arithmetic.
For these cases, you'll need equivalent logic using <b>expr</b>, e.g.<br>
instead of<br>
<code>x=$(($x+1))</code><br>
use<br>
<code>x=&#96;expr $x + 1&#96;</code></p>

<p>Another recurring one is the <b>-i</b> argument for <b>sed</b>;
this is not standard and not supported everywhere so just do not use
it. The alternative:<br>
<code>$ sed ... &lt;somefile &gt;$tmp.tmp; mv $tmp.tmp somefile</code><br>
works everywhere for most of the cases in QA.
If cross-filesystem linking and a lame <b>mv</b> are in play then the
following is even more portable:<br>
<code>$ sed ... &lt;somefile &gt;$tmp.tmp; cp $tmp.tmp somefile; rm $tmp.tmp</code><br>
If permissions are in play, then you may need:<br>
<code>$ sed ... &lt;somefile &gt;$tmp.tmp; $sudo cp $tmp.tmp somefile; rm $tmp.tmp</code></p>
<p>Do not use <b>seq</b> as it is not portable.
For example, this is not portable:<br>
<code>for i in $(seq 4); do ...; done</code><br>
but it can be rewritten as follows and this will work everywhere:<br>
<code>i=1; while [ $i -le 4 ]; do ...; i=`expr $i + 1`; done</code><br>
</p>

<h3><a name="filtering"></a>Output filtering</h3>
<p>Each QA test script must produce a standard output stream that
is deterministic.
The
<a href="https://github.com/performancecopilot/pcp/blob/master/qa/check">qa/check</a>
script that is responsible for running each QA test script (say <i>NNNN</i>)
captures the standard output from <i>NNNN</i> and if this matches the expected
result in <i>NNNN.out</i> then then test is deemed to have passed, otherwise
the standard output from <i>NNNN</i> is saved in <i>NNNN.out.bad</i> and the
test is deemed to have failed.</p>
<p>So it is critical that output of <i>NNNN</i> is deterministic across
all platforms and timezones.</p>
<p>Applications used in a QA test script will potentially
produce output that is irrelevant
to the success or otherwise of the test.
The most obvious example is the current date from an application
run on the test system.
Consequently, most QA test scripts include one or more <b>_filter</b>() functions
that are responsible to translating the raw output from the applications
run in the test into the deterministic output.</p>
<p>These filtering functions in turn often use a set of standard functions in
<a href="https://github.com/performancecopilot/pcp/blob/master/qa/common.filter">qa/common.filter</a>
that may be applied to the output of common PCP commands and operations.</p>
<p>There are way too many filtering cases to describe them all here, but the list below
is illustrative of the range of techniques that may be needed.
And each QA test script may need employ more than one of the techniques
to produce deterministic output for &quot;correct&quot;
execution across all platforms.</p>
<ul>
<li>
For many operations involving starting or stopping PCP daemons and for
common output formats from PCP commands there are existing filtering
functions in
<a href="https://github.com/performancecopilot/pcp/blob/master/qa/common.filter">qa/common.filter</a>
that may be used, for example:
<ul>
<li><b>_filter_pcp_start</b>() filters the output from running the <b>pcp</b>
&quot;init&quot; script that starts <b>pmcd</b> and <b>pmlogger</b>.
May also be used for the separate <b>pmcd</b> and <b>pmlogger</b>
&quot;init&quot; scripts that start just <b>pmcd</b> and <b>pmlogger</b>
respectively.</li>
<li><b>_filter_pcp_stop</b>() filters the output from running the <b>pcp</b>
&quot;init&quot; script that stops <b>pmcd</b> and <b>pmlogger</b>.
May also be used for the separate <b>pmcd</b> and <b>pmlogger</b>
&quot;init&quot; scripts that stop just <b>pmcd</b> and <b>pmlogger</b>
<li><b>_filter_pmie_start</b>() filters the output from running the <b>pmie</b>
&quot;init&quot; script that starts <b>pmie</b>.</li>
<li><b>_filter_pmda_install</b>() filters the output from installing a
new PMDA.</li>
<li><b>_filter_pmcd_log</b>() filters the log file from <b>pmcd</b>.</li>
<li><b>_filter_pmlogger_log</b>() filters the log file from <b>pmlogger</b>.</li>
<li><b>_filter_pmdumplog_log</b>() filters the log file from <b>pmdumplog</b>.</li>
<li><b>_filter_pmdumptext_log</b>() filters the log file from <b>pmdumptext</b>.</li>
<li><b>_filter_optional_pmdas</b>() removes lines containing the names
of executable PMDAs that may, or may not, be installed.</li>
<li><b>_filter_top_pmns</b>() removes lines containing the just the
top-level PMNS names for PMDAs that may, or may not, be installed.</li>
<li><b>_filter_pmie_log</b>() filters the log file from <b>pmie</b>.</li>
<li><b>_filter_dumpresult</b>() filters the diagnostic output from
<b>pmDumpResult</b>() (translate timestamp to a literal <b>TIMESTAMP</b>,
remove data addresses, replace metric values by literals
like <b>NUMBER</b>, <b>HEXNUMBER</b>, <b>STRING</b> or
<b>AGGREGATE</b>).</li>
<li><b>_filter_dbg</b>() strips timestamps from some of the diagnostics
generated by the <b>-D</b> command line option to PCP commands.</li>
</ul>
</li>
<li>Local date/time filtering.  Typically the current date and time is
not critical to the success of the test, so replacing the actual date and/or
time by a constant is common filtering operation.  See for example DATE in
_filter() in
<a href="https://github.com/performancecopilot/pcp/blob/master/qa/866">qa/866</a>, or
TIMESTAMP in <b>_filter__pmdumplog</b>() in
<a href="https://github.com/performancecopilot/pcp/blob/master/qa/common.filter">qa/common.filter</a>
</li>
<li>Another approach to ensuring deterministic output for tests that
use PCP archives is to use the <b>-z</b> command line argument to most PCP commands
so that any date and time reporting uses the timezone of the archive
rather than the timezone of the machine where the test is being run.
See for example
<a href="https://github.com/performancecopilot/pcp/blob/master/qa/1072">qa/1072</a>.
</li>
<li>Remote host name, IP addresses, process ids (pids) and port numbers
are all examples of data that is likely to vary across the platforms
where the QA test scripts are run and so are candidates for filtering.
<a href="https://github.com/performancecopilot/pcp/blob/master/qa/100">qa/100</a>
includes several examples of this type of filtering.
</li>
<li>For some QA tests scripts the &quot;correct&quot; information
is confined to one or more fields or columns.  If the <i>other</i> fields
and columns are not deterministic, then vertical filtering may be
appropriate.
For an example using <b>cut</b> see
<a href="https://github.com/performancecopilot/pcp/blob/master/qa/951">qa/951</a>
while
<a href="https://github.com/performancecopilot/pcp/blob/master/qa/957">qa/957</a>
shows how vertical filtering can be achieved using <b>awk</b>.
</li>
<li>Similarly, sometimes the information that is critical to
the outcome of a QA test may be in only some of the output lines
while the other lines are not constant across platforms.
Horizontal filtering may be pattern-based (e.g. using <b>grep</b> or <b>sed</b>)
or ordinal line number based (e.g. using <b>sed</b> or <b>awk</b> to remove or
keep lines N-M) or FSA-based (e.g. using <b>awk</b> to skip lines until
the first line of interest is found, then output the following lines
until the last line of interest is found).</li>
<li>Gratuitous white space differences for the same command across different
platforms is another common scenario.
See for example <b>od</b> output filtering in
<a href="https://github.com/performancecopilot/pcp/blob/master/qa/079">qa/079</a>,
or <b>wc</b> output filtering in
<a href="https://github.com/performancecopilot/pcp/blob/master/qa/1059">qa/1059</a>.
</li>
<li>For some tests, a value may be deemed &quot;correct&quot;
if it is in an expected range.
In these cases filtering is required to map the actual value onto
a constant literal value.  For an example, see the <b>_range</b>() function in
<a href="https://github.com/performancecopilot/pcp/blob/master/qa/178">qa/178</a>.
</li>
</ul>
<p>To the extent that filtering removes or rewrites what would otherwise
be the output of a QA test script, there maybe some difficulty in triaging
test failures if you have only the expected <i>NNNN.out</i> and the observed
<i>NNNN.out.bad</i> files.
Most QA test scripts will emit unfiltered output and diagnostics to aid
triage and
these are written to a <i>NNNN.full</i> file which is retained for inspection
after the test has been run.</p>

<h3><a name="notrun"></a>Not run</h3>
<p>If QA test script <i>NNNN</i> decides it cannot or should not be run on the
current platform, it should create a <i>NNNN.notrun</i> file.
Optionally <i>NNNN.notrun</i> contains the reason the test has not been run.
</p>
<p>The convenience function <b>_notrun</b>() defined in 
<a href="https://github.com/performancecopilot/pcp/blob/master/qa/common.check">qa/common.check</a>
may be used to create the file, write the reason to the file and exit.
</p>
<p>Some common reasons for a test to be &quot;not run&quot; are:
<ul>
<li>the test is platform-specific and makes no sense on other platforms</li>
<li>an optional PMDA that is required for the QA test script is not installed</li>
<li>a command or application or service that is required for the QA test
script is not installed or cannot be found</li>
<li>a Python or Perl module that is required for the QA test script is not installed</li>
<li>the QA test script requires a kernel, library, application or service
at some minimum release or version level</li>
</ul>
</p>
<h3>PMNS ordering</h3>
<p>If a QA test script uses an application that traverses the PMNS
for a non-leaf metric name (usually from the command line or a configuration
file)
then depending on the PMDA(s) involved,
there is a chance that the names in the PMNS may be processed
in different orders on different platforms.</p>
<p>The simplest solution is to enumerate the PMNS nodes of interest
in the QA script using <b>pminfo</b>, sort the list and then have the
application being tested operate on the leaf nodes of the PMNS one
at a time.
<a href="https://github.com/performancecopilot/pcp/blob/master/qa/647">qa/647</a>
illustrates an example of this approach.</p>

<h3>Instance ordering</h3>
<p>Some PMDAs maintain their instance domains in a manner
that may present instances in non-deterministic order, so while
all instances may be present on all platforms, the sequence of the
instances within a instance domain or in a pmResult is not the same
everywhere.</p>
<p>The QA helper application <i>$here/src/sortinst</i> (source in
<a href="https://github.com/performancecopilot/pcp/blob/master/qa/src/sortinst.c">qa/src/sortinst.c</a>
may be use to re-order the reported instances so the sequence is
deterministic.
See
<a href="https://github.com/performancecopilot/pcp/blob/master/qa/1180">qa/1180</a>
for an example that uses <i>$here/src/sortinst</i> to re-order <b>pminfo</b> <b>-</b>f output.
</p>

<h3>Sort ordering</h3>
<p>Unfortunately <b>sort</b> does not produce the same sorted
order on all platforms by default.
We need to take control of <b>LC_COLLATE</b> and the decision has been made
to standardize on <b>POSIX</b> as the collating sequence for PCP QA.</p>
<p><b>$LC_COLLATE</b> is set and exported into the environment in
<a href="https://github.com/performancecopilot/pcp/blob/master/qa/common.rc">common.rc</a>,
but this is a relatively recent change so <b>LC_COLLATE=POSIX</b> is liberally
sprinkled throughout the QA test suite ahead of running <b>sort</b>.
</p>

<h3>Filesystem directory order</h3>
<p>Some PMDAs use <b>readdir</b>() or similar routines to scan a directory
contents to find metrics and/or instances.
This practice is certain to expose platform differences as the order
of directory entries is unlikely to be deterministic.</p>
<p>Judicious use of sort will be required.  See
<a href="https://github.com/performancecopilot/pcp/blob/master/qa/496">qa/496</a>
for a typical example of how this should be done.

<h3><a name="alternate.out"></a>Alternate .out files</h3>
<p>For some PMDAs and/or some QA test scripts,
no amount of clever engineering will hide the fact
that a &quot;correct&quot; test execution on one platform may produce different
output to &quot;correct&quot; test execution on another platform.
In these cases the simplest choice is to <b>not</b> have a single <i>NNNN.out</i>
file, but rather have a set of them and the QA test script chooses the
most appropriate one and links this to <i>NNNN.out</i> when it starts.</p>

<p>Alternate <i>NNNN.out</i> files may be required in the following
types of cases:
<ul>
<li>PMNS differences (especially for the kernel PMDA), e.g
<a href="https://github.com/performancecopilot/pcp/blob/master/qa/012">qa/012</a></li>
<li>Base service differences, e.g. IPv6 may, or may not, be present for
<a href="https://github.com/performancecopilot/pcp/blob/master/qa/365">qa/365</a></li>
<li>Hashing differences</li>
<li>Different word size may mean different underlying metric data type
differences, e.g. in
<a href="https://github.com/performancecopilot/pcp/blob/master/qa/926">qa/926</a></li>
<li>Endianness differences make some dump outputs completely different,
e.g. in
<a href="https://github.com/performancecopilot/pcp/blob/master/qa/422">qa/422</a></li>
</ul>
</p>

</body>
